Formula 1 (F1) [1] is the highest class of international motor racing sanctioned and organised by the Fédération Internationale de l'Automobile (FIA). A Formula One season consists of a series of races called Grands Prix, which are held on race circuits all over the world. Drivers earn points at every race, and the driver with the most points after the last Grand Prix is the world champion for the year.
Since the 1950s there has been immense technological advancement in the field of motor vehicles. Formula One constructors lead the design and innovation of better, faster and more efficient automobile engineering. Formula 1 cars are the fastest regulated road-course racing cars in the world, with very high cornering speeds achieved through the large amounts of aerodynamic downforce generated by the shape of the car and its wings. Naturally, driving a Formula One car is physically demanding, which makes racing one an extremely competitive sport.
The 2021 Formula 1 championship consisted of 10 teams with 2 drivers each. These 20 drivers are among the best in the world. Because all the cars have very similar specifications, winning depends on driving skill, the quality of the car and a bit of luck. Lap times are measured to the millisecond, and the gap between the leading cars is often only a few tenths of a second per lap; that is how competitive an F1 race gets.
This analysis focuses on one specific race: the 2021 Abu Dhabi Grand Prix [2], the last race of the season. The race at this circuit consisted of 58 laps. It was a very important race in Formula 1 history because this one race decided the championship between two drivers.
Lewis Hamilton is a 7-time Formula 1 World Champion. Winning the title at this race would have made him world champion for the eighth time, breaking the record of 7 championships that he shares with another legendary driver, Michael Schumacher.
Max Verstappen is a very young driver in his 20s who had never won a world championship before. But his amazing performance this season made him a contender for the championship. Coincidentally, both Lewis Hamilton and Max Verstappen had 369.5 points coming into the race. This made the race the championship decider: whoever finished ahead of the other would be the next world champion.
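Because the two drivers were level on points, the title depended only on their finishing positions in this race. The following is a minimal sketch of that arithmetic, assuming the usual 25-18-15-12-10-8-6-4-2-1 points scale for the top ten and ignoring the bonus point for the fastest lap. The helper names points_for and final_totals are hypothetical and are not part of the analysis code.

```r
# Points for finishing positions 1 to 10 (assumed standard scale;
# the fastest-lap bonus point is deliberately ignored in this sketch).
race_points <- c(25, 18, 15, 12, 10, 8, 6, 4, 2, 1)

# Hypothetical helper: points for one finishing position, 0 outside the top ten.
points_for <- function(position) {
  if (!is.na(position) && position >= 1 && position <= 10) {
    race_points[position]
  } else {
    0
  }
}

# Both drivers started the race on 369.5 points (taken from the text above).
final_totals <- function(ham_pos, ver_pos) {
  c(Hamilton = 369.5 + points_for(ham_pos),
    Verstappen = 369.5 + points_for(ver_pos))
}

totals <- final_totals(ham_pos = 2, ver_pos = 1)
totals
names(which.max(totals))  # driver with the higher total
```

With Hamilton 2nd and Verstappen 1st, the sketch gives 387.5 and 394.5 points. The results table later in this report shows 26 points for Verstappen in this race because it includes the fastest-lap bonus.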
There was a controversy associated with the Abu Dhabi Grand Prix. One of the drivers crashed on lap 53 of the race. Up to this point Lewis Hamilton had been leading the race, with Max Verstappen in 2nd position for most of it. When there is a crash in Formula 1, a safety car is deployed. The safety car drives ahead of all the drivers at a controlled pace while the marshals clear the track of crash debris. One of the important safety car rules is that no driver may overtake another driver while the safety car is out, as the race is neutralised.
Lewis Hamilton and Max Verstappen were so fast that they had caught up with cars that were still completing lap 56 while they themselves were on lap 57. Because the safety car was out, no one was allowed to overtake these lapped cars. So on lap 57 there were lapped cars, i.e., cars running 1 lap behind, between Lewis Hamilton and Max Verstappen. The then race director, Michael Masi, allowed only those lapped cars to overtake the safety car and unlap themselves. This was widely regarded as a departure from the normal safety car procedure, and it put Max Verstappen directly behind Hamilton for the restart.
The situation was now extremely intense: just 1 lap of racing was left and the two championship contenders were in 1st and 2nd position. Max Verstappen overtook Hamilton on the final lap and won the 2021 Abu Dhabi Grand Prix, taking the World Championship title for the first time.
This analysis asks whether Max Verstappen would still have had a chance of winning the race if those lapped cars had not been allowed to overtake the safety car. Past performance is a reasonable indicator of the chance of success, so we will use the 2021 season statistics to evaluate both drivers’ performance.
The Ergast Developer API [3] is an experimental web service which provides a historical record of motor racing data for non-commercial purposes. The data set contains detailed information about drivers, races and lap times going back to the 1950s. It can be downloaded from the Ergast website as a zip file containing all the csv files. The data set consists of 14 csv files which are interconnected by identifiers such as raceId and driverId. This report makes use of the 7 files that are read in below.
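To illustrate how the files connect through these identifiers, here is a minimal sketch using small hand-made data frames in place of the real csv files. The toy_* names are hypothetical; the values match the 2021 Abu Dhabi results shown later in this report.

```r
# Toy stand-ins for results.csv, races.csv and drivers.csv.
toy_results <- data.frame(
  raceId = c(1073, 1073),
  driverId = c(830, 1),
  points = c(26, 18)
)
toy_races <- data.frame(
  raceId = 1073,
  year = 2021,
  name = "Abu Dhabi Grand Prix"
)
toy_drivers <- data.frame(
  driverId = c(1, 830),
  surname = c("Hamilton", "Verstappen")
)

# raceId links results to races; driverId links results to drivers.
joined <- merge(toy_results, toy_races, by = "raceId")
joined <- merge(joined, toy_drivers, by = "driverId")
joined[, c("surname", "name", "year", "points")]
```

The analysis itself uses the same idea with merge() and dplyr’s left_join() on the full tables.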
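Loading the packages used in this report and reading all the csv files into their respective variables:
library(dplyr)      # %>%, filter(), select(), left_join(), summarise()
library(ggplot2)    # histograms and the qualifying plot
library(patchwork)  # assumed: lets two ggplot objects be combined with +
library(plotly)     # ggplotly()
drivers <- read.csv("f1db_csv/drivers.csv")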
results <- read.csv("f1db_csv/results.csv")
lap_times <- read.csv("f1db_csv/lap_times.csv")
circuits <- read.csv("f1db_csv/circuits.csv")
constructors <- read.csv("f1db_csv/constructors.csv")
qualifying <- read.csv("f1db_csv/qualifying.csv")
races <- read.csv("f1db_csv/races.csv")
The Abu Dhabi Grand Prix is held at the Yas Marina Circuit, which has been assigned a circuitId of 24:
abudhabi_circuit <- circuits %>%
  filter(location == "Abu Dhabi") %>%
  select(circuitId, name, location, country)
abudhabi_circuit %>% knitr::kable()
| circuitId | name | location | country |
|---|---|---|---|
| 24 | Yas Marina Circuit | Abu Dhabi | UAE |
All the races that have taken place at the Abu Dhabi circuit:
abudhabi_races <- races %>%
  filter(circuitId == abudhabi_circuit$circuitId)
Selecting the required columns from the results data set and saving them to a new variable new_results:
new_results <- results %>%
select(raceId,driverId, constructorId, positionText, points, laps, time, fastestLap, fastestLapTime, fastestLapSpeed)
new_results
Selecting the columns required from races and assigning to the
variable new_races:
new_races <- races %>%
select(raceId, year, circuitId, name)
new_races
Intermediary variable to merge new_results and
new_races by raceId.
test <- merge(x=new_results,y=new_races,by="raceId")
test %>%
  filter(year == 2021)
Using the above variables, we create a variable with all the information about Hamilton’s performance in the 2021 season:
hamilton_performance <- test %>%
filter(year==2021, driverId==1)
hamilton_performance <- left_join(hamilton_performance, drivers, by="driverId")
hamilton_performance <- hamilton_performance %>%
select(driverId,code, forename, surname, positionText, points, laps, time, fastestLap, fastestLapTime, fastestLapSpeed, circuitId, raceId, year, name)
hamilton_performance %>% knitr::kable()
| driverId | code | forename | surname | positionText | points | laps | time | fastestLap | fastestLapTime | fastestLapSpeed | circuitId | raceId | year | name |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | HAM | Lewis | Hamilton | 1 | 25.0 | 57 | 1:24:28.471 | 50 | 1:25.084 | 227.633 | 78 | 1051 | 2021 | Qatar Grand Prix |
| 1 | HAM | Lewis | Hamilton | 1 | 25.0 | 56 | 1:32:03.897 | 44 | 1:34.015 | 207.235 | 3 | 1052 | 2021 | Bahrain Grand Prix |
| 1 | HAM | Lewis | Hamilton | 2 | 19.0 | 63 | +22.000 | 60 | 1:16.702 | 230.403 | 21 | 1053 | 2021 | Emilia Romagna Grand Prix |
| 1 | HAM | Lewis | Hamilton | 1 | 25.0 | 66 | 1:34:31.421 | 47 | 1:20.933 | 206.971 | 75 | 1054 | 2021 | Portuguese Grand Prix |
| 1 | HAM | Lewis | Hamilton | 1 | 25.0 | 66 | 1:33:07.680 | 54 | 1:20.665 | 208.640 | 4 | 1055 | 2021 | Spanish Grand Prix |
| 1 | HAM | Lewis | Hamilton | 7 | 7.0 | 78 | +1:08.231 | 69 | 1:12.909 | 164.769 | 6 | 1056 | 2021 | Monaco Grand Prix |
| 1 | HAM | Lewis | Hamilton | 15 | 0.0 | 51 | +17.668 | 43 | 1:44.769 | 206.270 | 73 | 1057 | 2021 | Azerbaijan Grand Prix |
| 1 | HAM | Lewis | Hamilton | 2 | 19.0 | 71 | +35.743 | 71 | 1:07.058 | 231.811 | 70 | 1058 | 2021 | Styrian Grand Prix |
| 1 | HAM | Lewis | Hamilton | 2 | 18.0 | 53 | +2.904 | 44 | 1:37.410 | 215.903 | 34 | 1059 | 2021 | French Grand Prix |
| 1 | HAM | Lewis | Hamilton | 4 | 12.0 | 71 | +46.452 | 55 | 1:08.126 | 228.177 | 70 | 1060 | 2021 | Austrian Grand Prix |
| 1 | HAM | Lewis | Hamilton | 1 | 25.0 | 52 | 1:58:23.284 | 45 | 1:29.699 | 236.430 | 9 | 1061 | 2021 | British Grand Prix |
| 1 | HAM | Lewis | Hamilton | 2 | 18.0 | 70 | +2.736 | 49 | 1:18.715 | 200.363 | 11 | 1062 | 2021 | Hungarian Grand Prix |
| 1 | HAM | Lewis | Hamilton | 3 | 7.5 | 1 | +2.601 |  |  |  | 13 | 1063 | 2021 | Belgian Grand Prix |
| 1 | HAM | Lewis | Hamilton | 2 | 19.0 | 72 | +20.932 | 72 | 1:11.097 | 215.654 | 39 | 1064 | 2021 | Dutch Grand Prix |
| 1 | HAM | Lewis | Hamilton | R | 0.0 | 25 |  | 3 | 1:25.870 | 242.864 | 14 | 1065 | 2021 | Italian Grand Prix |
| 1 | HAM | Lewis | Hamilton | 1 | 25.0 | 53 | 1:30:41.001 | 43 | 1:37.575 | 215.760 | 71 | 1066 | 2021 | Russian Grand Prix |
| 1 | HAM | Lewis | Hamilton | 5 | 10.0 | 58 | +41.812 | 52 | 1:32.763 | 207.160 | 5 | 1067 | 2021 | Turkish Grand Prix |
| 1 | HAM | Lewis | Hamilton | 2 | 19.0 | 56 | +1.333 | 41 | 1:38.485 | 201.521 | 69 | 1069 | 2021 | United States Grand Prix |
| 1 | HAM | Lewis | Hamilton | 2 | 18.0 | 71 | +16.555 | 66 | 1:19.820 | 194.116 | 32 | 1070 | 2021 | Mexico City Grand Prix |
| 1 | HAM | Lewis | Hamilton | 1 | 25.0 | 71 | 1:32:22.851 | 46 | 1:11.982 | 215.503 | 18 | 1071 | 2021 | São Paulo Grand Prix |
| 1 | HAM | Lewis | Hamilton | 1 | 26.0 | 50 | 2:06:15.118 | 47 | 1:30.734 | 244.962 | 77 | 1072 | 2021 | Saudi Arabian Grand Prix |
| 1 | HAM | Lewis | Hamilton | 2 | 18.0 | 58 | +2.256 | 43 | 1:26.615 | 219.495 | 24 | 1073 | 2021 | Abu Dhabi Grand Prix |
Determining which position Hamilton held how many times over the season.
hamilton_performance %>%
group_by(positionText) %>%
summarise(count=n()) %>%
rename(Position = positionText , Races = count) %>%
arrange(as.numeric(Position)) %>%
knitr::kable()
| Position | Races |
|---|---|
| 1 | 8 |
| 2 | 8 |
| 3 | 1 |
| 4 | 1 |
| 5 | 1 |
| 7 | 1 |
| 15 | 1 |
| R | 1 |
From the above table we can see that Hamilton had a very consistent season: he finished 1st in 8 races and 2nd in 8 races. R stands for retired, which means that the car did not finish the race, for example because of a crash or a mechanical problem.
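Because positionText mixes numbers with the code R, calling as.numeric() on it returns NA for retirements and raises a coercion warning. A minimal sketch of handling the two cases explicitly, using the counts from the table above; the variable names are hypothetical.

```r
# Hamilton's 2021 finishing positions, rebuilt from the counts in the table above.
position_text <- c(rep("1", 8), rep("2", 8), "3", "4", "5", "7", "15", "R")

# A result counts as classified only if the text is made up entirely of digits.
is_classified <- grepl("^[0-9]+$", position_text)

# Convert only the classified results, so no coercion warning is raised.
position_num <- rep(NA_real_, length(position_text))
position_num[is_classified] <- as.numeric(position_text[is_classified])

races <- length(position_text)
retirements <- sum(!is_classified)
podiums <- sum(position_num <= 3, na.rm = TRUE)

c(races = races, retirements = retirements, podiums = podiums)
```

For Hamilton this gives 22 races, 1 retirement and 17 podium finishes.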
Now we’ll do the same thing for Verstappen:
verstappen_performance <- test %>%
filter(year==2021, driverId == 830)
verstappen_performance <- left_join(verstappen_performance, drivers, by="driverId")
verstappen_performance <-verstappen_performance %>%
select(driverId,code, forename, surname, positionText, points, laps, time, fastestLap, fastestLapTime, fastestLapSpeed, circuitId, raceId, year, name)
verstappen_performance %>% knitr::kable()
| driverId | code | forename | surname | positionText | points | laps | time | fastestLap | fastestLapTime | fastestLapSpeed | circuitId | raceId | year | name |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 830 | VER | Max | Verstappen | 2 | 19.0 | 57 | +25.743 | 57 | 1:23.196 | 232.799 | 78 | 1051 | 2021 | Qatar Grand Prix |
| 830 | VER | Max | Verstappen | 2 | 18.0 | 56 | +0.745 | 41 | 1:33.228 | 208.984 | 3 | 1052 | 2021 | Bahrain Grand Prix |
| 830 | VER | Max | Verstappen | 1 | 25.0 | 63 | 2:02:34.598 | 60 | 1:17.524 | 227.960 | 21 | 1053 | 2021 | Emilia Romagna Grand Prix |
| 830 | VER | Max | Verstappen | 2 | 18.0 | 66 | +29.148 | 62 | 1:20.695 | 207.581 | 75 | 1054 | 2021 | Portuguese Grand Prix |
| 830 | VER | Max | Verstappen | 2 | 19.0 | 66 | +15.841 | 62 | 1:18.149 | 215.357 | 4 | 1055 | 2021 | Spanish Grand Prix |
| 830 | VER | Max | Verstappen | 1 | 25.0 | 78 | 1:38:56.820 | 58 | 1:14.649 | 160.929 | 6 | 1056 | 2021 | Monaco Grand Prix |
| 830 | VER | Max | Verstappen | R | 0.0 | 45 |  | 44 | 1:44.481 | 206.839 | 73 | 1057 | 2021 | Azerbaijan Grand Prix |
| 830 | VER | Max | Verstappen | 1 | 25.0 | 71 | 1:22:18.925 | 68 | 1:08.017 | 228.542 | 70 | 1058 | 2021 | Styrian Grand Prix |
| 830 | VER | Max | Verstappen | 1 | 26.0 | 53 | 1:27:25.770 | 35 | 1:36.404 | 218.156 | 34 | 1059 | 2021 | French Grand Prix |
| 830 | VER | Max | Verstappen | 1 | 26.0 | 71 | 1:23:54.543 | 62 | 1:06.200 | 234.815 | 70 | 1060 | 2021 | Austrian Grand Prix |
| 830 | VER | Max | Verstappen | R | 0.0 | 0 |  |  |  |  | 9 | 1061 | 2021 | British Grand Prix |
| 830 | VER | Max | Verstappen | 9 | 2.0 | 70 | +1:20.244 | 43 | 1:20.945 | 194.843 | 11 | 1062 | 2021 | Hungarian Grand Prix |
| 830 | VER | Max | Verstappen | 1 | 12.5 | 1 | 3:27.071 |  |  |  | 13 | 1063 | 2021 | Belgian Grand Prix |
| 830 | VER | Max | Verstappen | 1 | 25.0 | 72 | 1:30:05.395 | 60 | 1:13.275 | 209.244 | 39 | 1064 | 2021 | Dutch Grand Prix |
| 830 | VER | Max | Verstappen | R | 0.0 | 25 |  | 25 | 1:25.173 | 244.852 | 14 | 1065 | 2021 | Italian Grand Prix |
| 830 | VER | Max | Verstappen | 2 | 18.0 | 53 | +53.271 | 28 | 1:38.396 | 213.959 | 71 | 1066 | 2021 | Russian Grand Prix |
| 830 | VER | Max | Verstappen | 2 | 18.0 | 58 | +14.584 | 53 | 1:32.759 | 207.169 | 5 | 1067 | 2021 | Turkish Grand Prix |
| 830 | VER | Max | Verstappen | 1 | 25.0 | 56 | 1:34:36.552 | 52 | 1:39.096 | 200.278 | 69 | 1069 | 2021 | United States Grand Prix |
| 830 | VER | Max | Verstappen | 1 | 25.0 | 71 | 1:38:39.086 | 52 | 1:18.999 | 196.134 | 32 | 1070 | 2021 | Mexico City Grand Prix |
| 830 | VER | Max | Verstappen | 2 | 18.0 | 71 | +10.496 | 47 | 1:12.486 | 214.005 | 18 | 1071 | 2021 | São Paulo Grand Prix |
| 830 | VER | Max | Verstappen | 2 | 18.0 | 50 | +11.825 | 35 | 1:31.488 | 242.943 | 77 | 1072 | 2021 | Saudi Arabian Grand Prix |
| 830 | VER | Max | Verstappen | 1 | 26.0 | 58 | 1:30:17.345 | 39 | 1:26.103 | 220.800 | 24 | 1073 | 2021 | Abu Dhabi Grand Prix |
verstappen_performance %>%
group_by(positionText) %>%
summarise(count=n()) %>%
rename(Position = positionText , Count = count) %>%
arrange(as.numeric(Position)) %>%
knitr::kable()
| Position | Count |
|---|---|
| 1 | 10 |
| 2 | 8 |
| 9 | 1 |
| R | 3 |
From the above table we can see that Verstappen finished 1st in 10 races and 2nd in 8 races, which is a better record than Lewis Hamilton’s. However, he lost a lot of points because he retired from 3 races.
# as.numeric() turns the retirement code "R" into NA (with a warning), so retirements are not plotted
hamilton_hist <- ggplot(hamilton_performance, aes(x = as.numeric(positionText))) +
  geom_histogram(binwidth = 0.5) +  # binwidth overrides bins, so bins = 30 is dropped
  labs(x = "Position", title = "Hamilton's positions in 2021 season") +
  scale_y_continuous(limits = c(0, 10), breaks = seq(0, 10, 1)) +
  scale_x_continuous(limits = c(0, 20), breaks = seq(0, 20, 2))
verstappen_hist <- ggplot(verstappen_performance, aes(x = as.numeric(positionText))) +
  geom_histogram(binwidth = 0.5) +
  labs(x = "Position", title = "Verstappen's positions in 2021 season") +
  scale_y_continuous(limits = c(0, 10), breaks = seq(0, 10, 1)) +
  scale_x_continuous(limits = c(0, 20), breaks = seq(0, 20, 2))
# Placing two ggplot objects side by side with + relies on a package such as patchwork
combined_plots <- hamilton_hist + verstappen_hist
combined_plots
Cleaning the drivers datafile and selecting the required columns:
new_drivers <- drivers %>%
select(driverId, forename, surname, code, number)
new_drivers
Selecting just the ids and names from the constructors datafile:
new_constructors <- constructors %>%
select(constructorId, name)
new_constructors
Merging all the data variables into one. abudhabi_grand_prix has the data of all the races and drivers that have ever raced on this track.
abudhabi_grand_prix <- abudhabi_circuit
abudhabi_grand_prix <-left_join(abudhabi_grand_prix, new_races, by="circuitId")
abudhabi_grand_prix <-left_join(abudhabi_grand_prix, new_results, by="raceId")
abudhabi_grand_prix <-left_join(abudhabi_grand_prix, new_drivers, by="driverId")
abudhabi_grand_prix <-left_join(abudhabi_grand_prix, new_constructors, by="constructorId")
abudhabi_grand_prix
Verstappen’s Formula 1 race career started in 2015, so we will only consider the races that took place from 2015 to 2021. Assigning this to a new variable abudhabi_grand_prix_clean.
abudhabi_grand_prix_clean <- abudhabi_grand_prix %>%
select(year,positionText,points,fastestLap, fastestLapTime, forename, surname, name.y, name, driverId) %>%
filter(year > 2014 & year < 2022)
abudhabi_grand_prix_clean
Let’s see how many times each driver has won on this track.
abudhabi_grand_prix_clean %>%
  filter(positionText == "1", surname == "Hamilton" | surname == "Verstappen") %>%
  group_by(surname) %>%
  summarise(wins = n()) %>% knitr::kable()  # one row per win, so n() counts wins
| surname | wins |
|---|---|
| Hamilton | 3 |
| Verstappen | 2 |
Between 2015 and 2021, Hamilton won 3 times and Verstappen won twice on this race track.
Now let’s calculate how many points they scored from those wins on this track.
abudhabi_grand_prix_clean %>%
  filter(positionText == "1", surname == "Hamilton" | surname == "Verstappen") %>%
  group_by(surname) %>%
  summarise(points_scored = sum(points)) %>% knitr::kable()
| surname | points_scored |
|---|---|
| Hamilton | 76 |
| Verstappen | 51 |
As expected, Hamilton has scored more points because he has more wins. But considering that Verstappen’s race career only started in 2015, he has already scored 2 wins.
Let’s also determine the constructor points scored on this race track.
abudhabi_grand_prix_clean %>%
filter(name == "Red Bull" | name == "Mercedes") %>%
group_by(name) %>%
summarise(total_points = sum(points)) %>% knitr::kable()
| name | total_points |
|---|---|
| Mercedes | 261 |
| Red Bull | 157 |
Mercedes has a proven track record of consistently scoring on this
track. It has scored 261 points and Red Bull has scored 157 in a period
of 7 years.
These are the race statistics of just Hamilton and Verstappen in the 2021 Abu Dhabi Grand Prix:
abudhabi_gp_21 <- abudhabi_grand_prix_clean %>%
  filter(year == 2021, surname == "Hamilton" | surname == "Verstappen")
abudhabi_gp_21 %>% knitr::kable()
| year | positionText | points | fastestLap | fastestLapTime | forename | surname | name.y | name | driverId |
|---|---|---|---|---|---|---|---|---|---|
| 2021 | 1 | 26 | 39 | 1:26.103 | Max | Verstappen | Abu Dhabi Grand Prix | Red Bull | 830 |
| 2021 | 2 | 18 | 43 | 1:26.615 | Lewis | Hamilton | Abu Dhabi Grand Prix | Mercedes | 1 |
Interestingly, Max Verstappen’s fastest lap of 1:26.103 was about 500 milliseconds faster than Lewis Hamilton’s 1:26.615.
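The fastest lap times are stored as text in the form minutes:seconds.milliseconds, so they have to be converted before they can be subtracted. A minimal sketch of that conversion, using the two values from the table above; lap_time_to_seconds is a hypothetical helper and is not part of the analysis code.

```r
# Hypothetical helper: convert lap time strings such as "1:26.103" to seconds.
lap_time_to_seconds <- function(lap_time) {
  parts <- strsplit(lap_time, ":", fixed = TRUE)
  vapply(parts, function(p) as.numeric(p[1]) * 60 + as.numeric(p[2]), numeric(1))
}

# Fastest laps of the 2021 Abu Dhabi Grand Prix, from the table above.
fastest <- c(Verstappen = "1:26.103", Hamilton = "1:26.615")
fastest_seconds <- lap_time_to_seconds(fastest)

# Gap between the two fastest laps in whole milliseconds.
gap_ms <- round((fastest_seconds[["Hamilton"]] - fastest_seconds[["Verstappen"]]) * 1000)
gap_ms
```

The result is a gap of 512 milliseconds in Verstappen’s favour.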
These are the detailed lap statistics of the 2021 Abu Dhabi Grand Prix (raceId: 1073) for Hamilton (driverId: 1) and Verstappen (driverId: 830):
ab_gp_laptimes <- lap_times %>%
  filter(driverId == 1 | driverId == 830, raceId == 1073)
ab_gp_laptimes
ab_gp_laptimes <- left_join(ab_gp_laptimes, new_drivers, by = "driverId")
Let’s calculate the average lap time for both drivers:
ab_gp_laptimes %>%
left_join(abudhabi_gp_21, by="driverId") %>%
group_by(surname.x) %>%
summarise(Average_laptime_in_seconds = mean(milliseconds)/1000) %>% knitr::kable()
| surname.x | Average_laptime_in_seconds |
|---|---|
| Hamilton | 93.4414 |
| Verstappen | 93.4025 |
Here we can see that Verstappen was about 39 milliseconds per lap faster than Hamilton on average (93.4025 s against 93.4414 s).
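The size of this gap can be checked directly from the two averages in the table. The short sketch below also multiplies the per-lap gap by the 58 race laps as a sanity check against the finishing gap. The variable names are hypothetical.

```r
# Average lap times in seconds, copied from the table above.
avg_hamilton <- 93.4414
avg_verstappen <- 93.4025
race_laps <- 58  # race distance stated at the start of this report

# Average gap per lap in milliseconds, and the gap accumulated over the race.
avg_gap_ms <- (avg_hamilton - avg_verstappen) * 1000
total_gap_s <- (avg_hamilton - avg_verstappen) * race_laps

round(avg_gap_ms, 1)   # about 38.9 ms per lap
round(total_gap_s, 3)  # about 2.256 s over 58 laps
```

The accumulated gap of about 2.256 seconds matches the +2.256 shown for Hamilton in the 2021 Abu Dhabi row of his results table, as expected when both drivers complete all 58 laps.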
Let’s consider the qualifying results to determine which driver has a better grasp on the track.
new_qualifying <- left_join(qualifying, new_races, by = "raceId")
new_qualifying <- left_join(new_qualifying, new_drivers, by = "driverId")
new_qualifying
qualifying_plot <- new_qualifying %>%
  filter(name == "Abu Dhabi Grand Prix", driverId == 1 | driverId == 830, year > 2014 & year < 2022) %>%
  ggplot(aes(x = year, y = q3, color = surname)) + geom_point(size = 8) +
  scale_x_continuous(limits = c(2015, 2021), breaks = seq(2015, 2021, 1)) +
  geom_text(aes(label = q3), color = "black", fontface = "bold", vjust = 1.5,
            hjust = -0.2, size = 3) +
  labs(x = "Year", y = "Q3 Times", title = "Qualifying times for Hamilton and Verstappen", subtitle = "Q3 - 2015 to 2021")
ggplotly(qualifying_plot)
Here we can see that over the years, both drivers have improved their Q3 times. It is also clearly noticeable that Verstappen was the faster of the two in 2020 and 2021.
From all the findings and observations above, it is fair to conclude that Verstappen performed consistently well over the season. Moreover, his performance on the Abu Dhabi circuit was also commendable: his average lap time in the 2021 race was lower than Hamilton’s, and he was faster than Hamilton in Q3 in 2020 and 2021.
There were controversies regarding the race director allowing lapped cars to overtake the safety car, which was certainly an advantage to Max Verstappen. But considering Max’s performance, it seems likely that he would have overtaken those cars when the race restarted anyway, and there was a good chance that he would still have won the race.
So it is unfair to say with certainty that Hamilton would have won if the cars had not been allowed to overtake the safety car [4]. Verstappen’s performance was on point and he seems to have been a bit faster than Hamilton overall. Hence it is likely that Verstappen would still have won the 2021 Abu Dhabi Grand Prix and, with it, his first World Championship title!